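Belief Propagation (BP) is an important message-passing algorithm for various reasoning tasks over graphical models, including solving Constraint Optimization Problems (COPs). It has been shown that BP can achieve state-of-the-art performance on various benchmarks by mixing old and new messages before sending the new one (i.e., damping). However, existing methods of tuning a static damping factor for BP are not only laborious but also harm its performance. Moreover, existing BP algorithms treat each variable node's neighbors equally when composing a new message, which also limits their exploration ability. To address these issues, we seamlessly integrate BP, Gated Recurrent Units (GRUs), and Graph Attention Networks (GATs) to reason about dynamic weights and damping factors for composing new BP messages. Our model, Deep Attentive Belief Propagation (DABP), takes the factor graph and the BP messages in each iteration as input and infers the optimal weights and damping factors through GRUs and GATs, followed by a multi-head attention layer. Furthermore, unlike existing neural-based BP variants, we propose a novel self-supervised learning algorithm for DABP based on the solution cost, which does not require expensive training labels and also avoids the common out-of-distribution issue through efficient online learning. Extensive experiments show that our model significantly outperforms state-of-the-art baselines.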
translated by 谷歌翻译
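The damping step mentioned in the abstract above (mixing the old and the new message before sending) can be illustrated with a minimal sketch. The function name `damp` and the convention that a damping factor of 0 keeps only the new message are assumptions made here for illustration; DABP itself infers these factors dynamically rather than fixing them.

```python
def damp(old_msg, new_msg, damping):
    """Blend the previously sent message with the freshly computed one.

    Assumed convention: damping = 0 keeps only the new message, and values
    close to 1 keep mostly the old message.
    """
    if not 0.0 <= damping < 1.0:
        raise ValueError("damping must be in [0, 1)")
    return [damping * o + (1.0 - damping) * n for o, n in zip(old_msg, new_msg)]


old = [0.0, 4.0]  # message sent in the previous iteration (one entry per value)
new = [2.0, 0.0]  # message computed in the current iteration
print(damp(old, new, 0.5))  # [1.0, 2.0]
```

A static scheme uses the same `damping` value for every message and iteration, which is the setting the abstract describes as laborious to tune.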
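Distributed Constraint Optimization Problems (DCOPs) are an important subclass of combinatorial optimization problems, in which information and control are distributed among multiple autonomous agents. Previously, Machine Learning (ML) has largely been applied to solve combinatorial optimization problems by learning effective heuristics. However, existing ML-based heuristic methods often do not generalize to different search algorithms. Most importantly, these methods usually require full knowledge of the problem to be solved, which is unsuitable for distributed settings where centralization is unrealistic due to geographical limitations or privacy concerns. To address the generality issue, we propose a novel directed acyclic graph representation schema for DCOPs and leverage Graph Attention Networks (GATs) to embed the graph representations. Our model, GAT-PCM, is then pretrained offline with optimally labelled data to construct effective heuristics that boost a broad range of DCOP algorithms in which evaluating the quality of a partial assignment is critical, such as local search or backtracking search. Furthermore, to enable decentralized model inference, we propose a distributed embedding schema for GAT-PCM in which each agent exchanges only embedded vectors, and we show its soundness and complexity. Finally, we demonstrate the effectiveness of our model by combining it with a local search or a backtracking search algorithm. Extensive empirical evaluations indicate that the GAT-PCM-boosted algorithms significantly outperform state-of-the-art methods in various benchmarks. The pretrained model is available at https://github.com/dyc941126/gat-pcm.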
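To make "evaluating the quality of a partial assignment" concrete, here is a minimal sketch of the naive evaluation that counts only constraints whose variables are all assigned. This is not the GAT-PCM model; the data layout (binary constraints stored as cost tables) and the name `partial_cost` are assumptions for illustration. A learned heuristic would additionally estimate the cost of the constraints that are still open.

```python
def partial_cost(constraints, assignment):
    """Sum the costs of binary constraints whose two variables are both assigned."""
    total = 0
    for (u, v), table in constraints.items():
        if u in assignment and v in assignment:
            total += table[(assignment[u], assignment[v])]
    return total


# Hypothetical toy problem: three binary variables and two cost tables.
constraints = {
    ("x1", "x2"): {(0, 0): 5, (0, 1): 1, (1, 0): 2, (1, 1): 4},
    ("x2", "x3"): {(0, 0): 3, (0, 1): 0, (1, 0): 6, (1, 1): 2},
}
partial = {"x1": 0, "x2": 1}  # x3 is still unassigned
print(partial_cost(constraints, partial))  # 1
```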
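Multi-label classification (MLC) is a prediction task in which each sample can have more than one label. We propose a novel contrastive-learning-boosted multi-label prediction model based on a Gaussian mixture variational autoencoder (C-GMVAE), which learns a multimodal prior space and employs a contrastive loss. Many existing methods introduce extra complex neural modules, in addition to the prediction modules, to capture label correlations. We find that by using contrastive learning in the supervised setting, we can exploit label information effectively and learn meaningful feature and label embeddings that capture both label correlations and predictive power, without extra neural modules. Our method also adopts the idea of learning and aligning latent spaces for both features and labels. In contrast to previous works based on a unimodal prior, C-GMVAE imposes a Gaussian mixture structure on the latent space to alleviate the posterior collapse and over-regularization issues. C-GMVAE outperforms existing methods on multiple public datasets and can often match other models' full performance with only 50% of the training data. Furthermore, we show that the learned embeddings provide insights into the interpretation of label-label interactions.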
Consensus clustering aggregates partitions to find a better fit by reconciling clustering results from different sources or executions. In practice, clustering tasks contain noise and outliers, which may significantly degrade performance. To address this issue, we propose a novel algorithm, robust consensus clustering, that can find common ground truth among experts' opinions and tends to be minimally affected by the bias caused by outliers. In particular, we formalize the robust consensus clustering problem as a constraint optimization problem, and then derive an effective algorithm based on the alternating direction method of multipliers (ADMM) with a rigorous convergence guarantee. Our method outperforms the baselines on benchmarks. We apply the proposed method to real-world advertising campaign segmentation and forecasting tasks, using consensus clustering results based on the similarity computed via Kolmogorov-Smirnov statistics. The accurate clustering result is helpful for building advertiser profiles in order to perform forecasting.
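As a concrete reference for the Kolmogorov-Smirnov statistic mentioned above, the following sketch computes the standard two-sample statistic, i.e. the largest gap between two empirical distribution functions. How the paper turns this into a similarity is not stated in the abstract; using `1 - D` below is only an assumption for illustration.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: max |F_a(x) - F_b(x)| over all sample points."""
    a, b = sorted(a), sorted(b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    for x in a + b:
        fa = bisect_right(a, x) / len(a)  # empirical CDF of a at x
        fb = bisect_right(b, x) / len(b)  # empirical CDF of b at x
        d = max(d, abs(fa - fb))
    return d


d = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
print(d)        # 0.5
print(1.0 - d)  # assumed similarity score: 0.5
```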
In computational advertising, a challenging problem is how to recommend a bid for advertisers to achieve the best return on investment (ROI) given a budget constraint. This paper presents a bid recommendation scenario that discovers the concavity changes in click prediction curves. The recommended bid is derived from the turning point from significant increase (i.e., concave downward) to slow increase (convex upward). A parametric learning-based method is applied by solving the corresponding constraint optimization problem. Empirical studies on real-world advertising scenarios clearly demonstrate performance gains in business metrics (including revenue increase, click increase, and advertiser ROI increase).
In cost-per-click (CPC) or cost-per-impression (CPM) advertising campaigns, advertisers always run the risk of spending the budget without getting enough conversions. Moreover, bidding on advertising inventory has little connection with the propensity to reach target cost-per-acquisition (tCPA) goals. To address this problem, this paper presents a bid optimization scenario to achieve the desired tCPA goals for advertisers. In particular, we build an optimization engine that makes decisions by solving a rigorously formalized constrained optimization problem, which leverages a bid landscape model learned from rich historical auction data using non-parametric learning. The proposed model can naturally recommend a bid that meets the advertisers' expectations by making inferences over advertisers' historical auction behaviors, which addresses the data challenges commonly faced by bid landscape modeling: incomplete logs in auctions, and uncertainty due to variation and fluctuations in advertising bidding behaviors. The bid optimization model outperforms the baseline methods on real-world campaigns, and has been applied to a wide range of scenarios for performance improvement and revenue lift.
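The idea of choosing a bid under a tCPA constraint can be sketched as follows. Everything here is hypothetical: the bid landscape is a hand-written table of (bid, expected cost, expected conversions) rather than a learned model, and picking the highest feasible bid is one plausible objective, not necessarily the formulation used in the paper.

```python
def recommend_bid(landscape, target_cpa):
    """Return the highest candidate bid whose expected CPA meets the target.

    landscape: iterable of (bid, expected_cost, expected_conversions).
    Returns None when no candidate satisfies the constraint.
    """
    feasible = [
        bid
        for bid, cost, conversions in landscape
        if conversions > 0 and cost / conversions <= target_cpa
    ]
    return max(feasible) if feasible else None


# Hypothetical landscape: higher bids buy more conversions at a higher CPA.
landscape = [
    (0.5, 10.0, 2.0),  # expected CPA 5.0
    (1.0, 30.0, 5.0),  # expected CPA 6.0
    (1.5, 70.0, 8.0),  # expected CPA 8.75
]
print(recommend_bid(landscape, 6.5))  # 1.0
```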
We propose a new neural network design paradigm, Reversible Column Network (RevCol). The main body of RevCol is composed of multiple copies of a subnetwork, called columns, between which multi-level reversible connections are employed. This architectural scheme gives RevCol very different behavior from conventional networks: during forward propagation, features in RevCol are gradually disentangled as they pass through each column, and their total information is maintained rather than compressed or discarded as in other networks. Our experiments suggest that CNN-style RevCol models can achieve very competitive performance on multiple computer vision tasks such as image classification, object detection, and semantic segmentation, especially with a large parameter budget and a large dataset. For example, after ImageNet-22K pre-training, RevCol-XL obtains 88.2% ImageNet-1K accuracy. Given more pre-training data, our largest model RevCol-H reaches 90.0% on ImageNet-1K, 63.8% APbox on the COCO detection minival set, and 61.0% mIoU on ADE20k segmentation. To our knowledge, these are the best COCO detection and ADE20k segmentation results among pure (static) CNN models. Moreover, as a general macro architecture, RevCol can also be introduced into transformers or other neural networks, which is demonstrated to improve performance in both computer vision and NLP tasks. We release code and models at https://github.com/megvii-research/RevCol
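The claim that information is "maintained rather than compressed or discarded" follows from reversibility: the inputs can be reconstructed exactly from the outputs. The sketch below shows this with a generic additive coupling in the style of reversible residual networks. It is not RevCol's multi-level column connection, and the functions `f` and `g` are arbitrary stand-ins chosen for illustration.

```python
def f(x):
    # Stand-in for an arbitrary (not necessarily invertible) sub-network.
    return [2.0 * v + 1.0 for v in x]


def g(x):
    # A second stand-in sub-network.
    return [v * v for v in x]


def forward(x1, x2):
    """Additive coupling: each output adds a function of the other stream."""
    y1 = [a + b for a, b in zip(x1, f(x2))]
    y2 = [a + b for a, b in zip(x2, g(y1))]
    return y1, y2


def inverse(y1, y2):
    """Recover the inputs by undoing the two additions in reverse order."""
    x2 = [a - b for a, b in zip(y2, g(y1))]
    x1 = [a - b for a, b in zip(y1, f(x2))]
    return x1, x2


y1, y2 = forward([1.0, 2.0], [3.0, 4.0])
print(y1, y2)           # [8.0, 11.0] [67.0, 125.0]
print(inverse(y1, y2))  # ([1.0, 2.0], [3.0, 4.0])
```

Note that `f` and `g` never need to be inverted themselves; only the additions are undone, which is why any sub-network can be used inside the coupling.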
We address the theoretical and practical problems related to the trajectory generation and tracking control of tail-sitter UAVs. Theoretically, we focus on the differential flatness property with full exploitation of actual UAV aerodynamic models, which lays a foundation for generating dynamically feasible trajectories and achieving high-performance tracking control. We have found that a tail-sitter is differentially flat with accurate aerodynamic models within the entire flight envelope, by specifying the coordinated flight condition and choosing the vehicle position as the flat output. This fundamental property allows us to fully exploit high-fidelity aerodynamic models in trajectory planning and tracking control to achieve accurate tail-sitter flights. In particular, an optimization-based trajectory planner for tail-sitters is proposed to design high-quality, smooth trajectories with consideration of kinodynamic constraints, singularity-free constraints, and actuator saturation. The planned flat-output trajectory is transformed into a state trajectory in real time, taking wind in the environment into account. To track the state trajectory, a global, singularity-free, and minimally-parameterized on-manifold MPC is developed, which fully leverages the accurate aerodynamic model to achieve high-accuracy trajectory tracking within the whole flight envelope. The effectiveness of the proposed framework is demonstrated through extensive real-world experiments in both indoor and outdoor field tests, including agile SE(3) flight through consecutive narrow windows requiring specific attitudes at speeds up to 10 m/s, typical tail-sitter maneuvers (transition, level flight, and loiter) at speeds up to 20 m/s, and extremely aggressive aerobatic maneuvers (Wingover, Loop, Vertical Eight, and Cuban Eight) with accelerations up to 2.5 g.
Traditional multilingual neural machine translation (MNMT) uses a single model to translate all directions. However, as the number of language pairs grows, simply using a single model for massive MNMT brings new challenges: parameter tension and large computational cost. In this paper, we revisit multi-way structures by assigning an individual branch to each language (group). Despite the simplicity of this architecture, it is challenging to train decentralized models because there are no constraints to align representations across languages. We propose a localized training recipe to map different branches into a unified space, resulting in an efficient detachable model, Lego-MT. For a fair comparison, we collect data from OPUS and build the first large-scale open-source translation benchmark covering 7 language-centric datasets, each containing 445 language pairs. Experiments show that Lego-MT (1.2B) brings gains of more than 4 BLEU while outperforming M2M-100 (12B). We will release all training data, models, and checkpoints.
Despite the surprising few-shot performance of in-context learning (ICL), it is still common practice to randomly sample examples to serve as context. This paper advocates a new principle for ICL: self-adaptive in-context learning. The self-adaptation mechanism is introduced to help each sample find an in-context example permutation (i.e., selection and ordering) that can derive the correct prediction, thus maximizing performance. To validate the effectiveness of self-adaptive ICL, we propose a general select-then-rank framework and instantiate it with new selection and ranking algorithms. In an extensive evaluation on eight different NLP datasets, our self-adaptive ICL method achieves a 40% relative improvement over the common practice setting. Further analysis reveals the enormous potential of self-adaptive ICL: given more advanced algorithms, it might be able to close the gap between ICL and finetuning. Our code is released to facilitate future research in this area: https://github.com/Shark-NLP/self-adaptive-ICL
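The select-then-rank idea (first pick candidate examples, then choose an ordering of them) can be sketched in a few lines. The similarity and scoring functions below are toy stand-ins invented for illustration; the paper's actual selection and ranking algorithms are not described in the abstract and are not reproduced here.

```python
from itertools import permutations


def word_overlap(query, example):
    """Toy similarity: number of shared whitespace-separated words."""
    return len(set(query.split()) & set(example.split()))


def select(query, pool, k, similarity):
    """Selection stage: keep the k pool examples most similar to the query."""
    return sorted(pool, key=lambda ex: similarity(query, ex), reverse=True)[:k]


def rank(query, candidates, score):
    """Ranking stage: return the best-scoring ordering of the candidates."""
    return max(permutations(candidates), key=lambda order: score(query, order))


def toy_score(query, order):
    # Hypothetical criterion: prefer placing more similar examples later.
    return sum(pos * word_overlap(query, ex) for pos, ex in enumerate(order))


pool = ["a great movie", "great acting", "awful food"]
query = "great movie night"
chosen = select(query, pool, k=2, similarity=word_overlap)
print(chosen)                          # ['a great movie', 'great acting']
print(rank(query, chosen, toy_score))  # ('great acting', 'a great movie')
```

Enumerating all permutations is only practical for a handful of candidates, which is one reason the selection stage narrows the pool first.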